Skip to content

Aggregate Functions#21

Open
gatesn wants to merge 20 commits intodevelopfrom
ngates/aggregates
Open

Aggregate Functions#21
gatesn wants to merge 20 commits intodevelopfrom
ngates/aggregates

Conversation

@gatesn
Copy link
Contributor

@gatesn gatesn commented Feb 28, 2026

No description provided.

Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
let agg = &options.aggregate_fn;

// Try encoding-specific fast path first.
if let Some(states) = list.elements().aggregate_list(&list, agg)? {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also the wrong type

Comment on lines +56 to +62
fn accumulate_list(&mut self, list: &ListViewArray) -> VortexResult<()> {
for i in 0..list.len() {
self.accumulate(&list.list_elements_at(i)?)?;
self.flush()?;
}
Ok(())
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want to use a array + offset + len, approach to avoid list construction at each step?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean each step?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I way thinking as you do pushdown or reduce you will need to unwrap the elements, unwrap an encodings and wrap that up with offset + len

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that == canonicalize to ListView?

gatesn added 3 commits March 2, 2026 21:36
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
/// Merge a partial state scalar into the current group state.
fn merge(
&self, options: &Self::Options, state: &mut Self::GroupState, partial: &Scalar,
) -> VortexResult<()>;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you define merge in this way? It could be (GroupState, GroupState) -> GroupState

Copy link
Contributor Author

@gatesn gatesn Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because then you need an extra function for Scalar -> GroupState and also merging on multiple groups takes an ArrayRef, not a Vec

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on this or did you define this else where?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what the scalar partial is here and is it similar to a GroupState state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the "vortex" version of GroupState. We could just use a Scalar to model GroupState if we wanted. Maybe it's nicer to have a native type for performance. Or maybe it's ok to just use and merge scalars.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we definitely want a native type for GroupState, e.g. string_concat can hold a mutable string buffer and accumulate data into it. Then we only convert to scalar when we flush.


/// Accumulate a canonical batch into the current group state.
fn accumulate(
&self, options: &Self::Options, state: &mut Self::GroupState, batch: &Canonical,
Copy link

@joseph-isaacs joseph-isaacs Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fallback and we have encoding specific kernels?

-> VortexResult<Self::GroupState>;

/// Accumulate a canonical batch into the current group state.
fn accumulate(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trying to pull out of stats happens here?

Signed-off-by: Nicholas Gates <nick@nickgates.com>
gatesn added a commit to vortex-data/vortex that referenced this pull request Mar 6, 2026
First PR implementing the Aggregate Functions proposal in
vortex-data/rfcs#21

---------

Signed-off-by: Nicholas Gates <nick@nickgates.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants